(54) Title: A VOICE RESPONSE UNIT WITH A VISUAL MENU INTERFACE
(57) Abstract 

An enhanced response unit that is able to send visual information to 
the customer as well as voice information, thereby providing the customer 
with a powerful and satisfying mechanism with which to achieve the 
desired result. The unit determines whether the CPE calling the unit is 
capable of interacting with visual information and adjusts its mode of 
operation accordingly. Because two channels of information are available, 
various capabilities are easily implemented, such as interacting with users in
visual mode and in aural mode simultaneously, or recalling the visual
menu process at any time. The improved unit also allows for convenient
linking, in response to an appropriate choice made by the customer, to a 
physically different response unit that provides its own menu. 
A VOICE RESPONSE UNIT WITH A VISUAL MENU INTERFACE

Background of the Invention 

This invention relates to telecommunication equipment and services and,
more particularly, to automated response units.

Voice response units (VRUs) are automated response apparatus to which 
a telecommunication customer can connect to obtain a particular service or to 
speak to a particular person. The customer is greeted by a sequence of 
electronically generated prompts that, through the interactive responses from the
customer, eventually connect the customer to the desired service or person.
Typically, the customer's response signal is a DTMF signal that results from
pressing one of the touch tone pad buttons. Some of the more sophisticated
VRUs in the network today respond to spoken words that either correspond to
the touch tone pad buttons that need to be pressed (e.g., "four") or actually
correspond to the meaning of the spoken words (e.g., "collect").

The VRU interaction with the customer is limited, however, because in 
today's VRUs the method for "signaling" by the VRU (i.e., the prompts sent to 
the customer) is by means of spoken words, and people have a fairly limited 
capacity when it comes to hearing, comprehending, remembering and thereafter
responding to a set of prompts or instructions. The result is that VRUs typically
present customers with very few choices in each of their prompts, relying on a
hierarchical approach to the prompts and answers that lead customers to the
desired state. For example, a store may choose to have the first question resolve
the nature of the customer's business with the store. Hence, the first prompt may
be "For questions about a bill, press 1; for questions about our sale, press 2; for
questions about a recent purchase, press 3; and if you wish to be directed to a
particular department, press 4." Once the first question is answered, the store
may wish the customer to advance to the next hierarchical level. For example,
"For automotive, press 1; for garden supplies, press 2; for clothing, press 3; and
for furniture, press 4."

It is readily apparent that while this mode of interaction is very powerful
because it allows the store to electronically (and hence inexpensively) handle
fairly complex interactions with customers before a person needs to be involved
(if at all), it is often frustrating to customers. The frustration arises from
situations where the customer needs to "back up" and cannot, when the prompt is
too long or too complex for the customer to comprehend and remember, when
the customer is guided through a sequence of prompts that is too long (and hence
too time consuming), when even after going through the long sequence of prompts
the choice that the customer wishes to make is not offered in any of the prompts
(e.g., the customer wishes to talk to someone in cosmetics, and not in
automotive, garden supplies, clothing or furniture, as in the above example), etc.

What is needed is a better interface, and I believe that a visual interface 
holds such a promise. 

In a somewhat different context, Bellcore has introduced a 
communication protocol that provides for bi-directional transmission of data 
between a stored program control system (SPCS) and specialized customer 
premises equipment (that includes a display screen), which Bellcore terms 
"analog display services customer premises equipment." This protocol, which is 
commonly referred to as ADSI (for Analog Display Services Interface), is 
described in Bellcore's publication titled "Generic Requirements for an SPCS to 
Customer Premises Equipment Data Interface for Analog Display Services,"
TR-NWT-001273, December 1992. The SPCS is connected, directly or remotely, to
the CPE. The remote connection may be via the Public Switched Telephone 
Network (PSTN). According to the Bellcore-proposed ADSI protocol, the SPCS 
hosts/servers must meet a number of requirements, and among them are: 

• The SPCS must be able to generate CPE Alerting Signals (ringing 
signals); 

• The SPCS must be able to provide standard dial tone; 

• The SPCS must be able to receive standard DTMF signaling; 

• The SPCS must be able to turn off the DTMF receiver; and others.

Those requirements are not compatible with VRUs.



WO 97/50236 




PCT/US97/06341 



From the above it can be seen that the ADSI protocol aims at providing a
limited visual messaging capability to a specialized CPE from an SPCS system
that provides the alerting and the dial tone signals to the CPE. PBXs and central
offices are such systems. One application at which this capability is apparently
aimed is the "voice mail" service that is offered in PBXs or central offices.

A viable extension for VRUs is necessary which provides for interactive 
operation across the telecommunication network, which eliminates the 
limitations imposed by the ADSI on the SPCS and CPE equipment, which 
eliminates the disadvantages of today's VRU-customer interfaces, and which is
robust enough to be acceptable to today's telecommunication network.

Summary 

The present day limitations of VRUs are overcome by providing a
mechanism for enhanced interaction between the VRU and the customer. In
accordance with the principles disclosed herein, the VRU becomes a Multimedia
Response Unit (MRU) which can signal the customer with more than just voice.
In particular, the MRU is able to send visual information to the customer as well
as voice information, thereby providing the customer with a powerful and
satisfying mechanism with which to achieve the desired result. When interacting
with video, the structure of the menu is revealed to the customer; the customer
is allowed to see at any one time more than one level of hierarchical menu
structures (when the menu has such a structure), is allowed to skip levels, is
allowed to back up, is allowed to skip from one "branch" of the tree to another,
and can be simultaneously provided with additional information. In
embodiments that provide the additional information, it may be imparted through
the voice channel or through the video channel, or both. The video channel, for
example, can be used to display advertising information in areas that are not used
to display the menu. Separate advertising screens between successive menu
screens can easily be incorporated.

In accordance with one feature, the MRU is informed of the type of CPE
that is connected to the MRU, and the MRU adjusts its mode of operation
accordingly. This allows the MRU to interact with all customers regardless of
the particular terminal equipment that the customer possesses, which makes the
MRU acceptable to today's telecommunication environment.

In accordance with another feature, even after the initial interaction is 
completed and the party calling the MRU conducts the substantive business that
prompted the call in the first instance, the digital channel can continue to be 
used, and a return to the menu is allowed anytime. 

In accordance with still another feature, the hierarchical menu structure 
allows for convenient linking to a physically different MRU, or VRU, that 
provides its own menu.

Brief Description of the Drawing 

FIG. 1 presents a block diagram of an arrangement that supports a visual
menu presentation to a customer premises equipment;

FIG. 2 illustrates a menu arrangement that comports with the FIG. 1
arrangement; and

FIG. 3 provides a general flow chart of the processes carried out in the
MRU.

Detailed Description

In accordance with the principles disclosed herein and explained more 
fully below by way of an illustrative embodiment, the options or selections 
presented to customers are in visual form, in aural form, or in both visual and 
aural form. The enhanced mode of operation (i.e., providing visual information)
is possible, however, only in connection with customer terminals that can handle
visual information. Recognizing that most present day terminal units are
conventional telephones that are limited to aural communication, the MRU
embodiment disclosed herein offers a voice-only mode of operation as
well as the enhanced visual mode of operation. Advantageously, such a system
automatically discerns the type of terminal with which it is communicating and
thereafter employs the appropriate mode of communication.

Most telephones with a visual display, or screen, merely employ the
screen to convey information to the user from a source other than the party at
the other end of the conversation to which the telephone is connected. The
communication is sometimes from the terminal itself (such as when the digits
dialed by the user are displayed on the screen), from the PBX to which the user is
connected (such as providing means for obtaining the extension numbers of other
customers connected to the PBX), or from the central office (such as when the
caller ID information is provided). In contradistinction, the arrangement
disclosed below provides a communication channel from the CPE (e.g.,
telephone), through the telecommunication network, to the party at the other end.
In the context discussed herein, since MRUs are not used for originating calls,
the "party at the other end" is the called party. This communication channel may
be embedded in the voice band (commingled with voice communication but in a
separate logical channel) or it can be realized through a separate channel. For
example, when the CPE is ISDN compatible (e.g., an ISDN phone) and the MRU
is ISDN compatible, both data and control can be communicated.

FIG. 1 presents an arrangement where an MRU 10 is connected via
telecommunication network 20 to a visual telephone 30. For illustrative
purposes, telephone 30 is embodied through a combination that includes a
simultaneous voice and data modem 31, a conventional telephone 32 connected
to the voice input port of modem 31, and a general purpose computer that
includes a processor 33 that is connected to the data port of modem 31. The
remaining elements of the computer are memory 34, a video display 35, and a
keyboard/mouse 36 that are all coupled to processor 33. As in conventional PCs,
memory 34 stores data and programs that control the operation of processor 33
as well as the information that is displayed on video display 35. Processor 33 is
also coupled to a control port of modem 31, to control the modem's operations,
and to the analog input of modem 31, to sense the state of the line (specifically,
to determine whether conventional telephone 32 is "off hook" or "on hook").
Modem 31 is a "simultaneous voice and data" modem of the type described, for
example, in U.S. Patent No. 5,440, and the arrangement of conventional
telephone 32 and processor 33 generally follows the disclosure in the above
mentioned patent, particularly with reference to FIG. 30 therein. Modem 31
provides for two independent (logical) channels over the voice band, with one
adapted for voice transmission and the other adapted for data transmission.

The combination making up visual telephone 30 can signal network 20
(e.g., dial out) either through conventional telephone 32 or through the computer.
When signaling through the computer, the operation is identical to that of a
conventional modem connected to a computer and operating under control of
communication software. For dialing out via telephone 32, processor 33 needs
to detect that telephone 32 goes "off hook," cause the modem to go "off hook" in
its interface to the CO in network 20, and cause the DTMF signals dialed by
telephone 32 to appear at the output of modem 31. This can be achieved by
simply switching the output of modem 31 to its voice input port or, alternatively,
processor 33 can detect the DTMF signals emanating from conventional
telephone 32 and repeat them (in proper format) to modem 31, which continues
to operate in the conventional manner as described above. (It may be noted in
passing that the signal on line 331 goes through an analog to digital conversion
before it is processed by processor 33. The A/D board which does the
conversion and which is associated with processor 33 is not shown for the sake of
simplicity.)

The first challenge for MRU 10 is to ascertain whether the signal it 
receives emanates from a CPE with a visual display that can be accessed by 
MRU 10, such as visual telephone 30, or emanates from a CPE that does not 
have an accessible visual display, such as conventional telephone 32. This
challenge is readily met by MRU 10 when a hailing signal is placed by processor
33 on the data channel.

A conventional VRU includes an alert detection circuit that responds to
an alert signal (e.g., ringing) on an incoming line. It informs a controller of this
condition and, under command of the controller, an "off hook" condition is
effected on the alerting line. The controller then activates the prompts program,
and communication proceeds. The very same arrangement is created in MRU
10, with alert detector 11 coupled to the incoming line and to controller 100,
which controls switch 13. The signal from switch 13 is coupled to voice
interface unit 14, which is coupled to processor 15 and to switch 16 that operates
under control of processor 15. Storage element 17 is coupled to switch 16 and is
controlled by processor 15. In MRU 10, additional circuitry is provided to
identify the type of instrument that is communicating with the MRU, and that
includes a simultaneous voice and data modem 12 that is coupled to switch 13
and to controller 100. When the communicating CPE is a visual telephone 30,
the data output of modem 12 contains the hailing signal sent by processor 33.
The presence of this signal, detected by processor 15, clearly indicates that the
MRU is communicating with a CPE which includes an accessible display.
Otherwise, controller 100 concludes that the communicating CPE does not have
an accessible display. The determinations made by controller 100 direct the
execution of different processes.
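
The mode decision described above can be pictured with a short Python sketch. It is an illustration only, under stated assumptions: the format of the hailing signal is not specified in this disclosure, so the byte string `HAIL` and the names `Mode` and `select_mode` are hypothetical.

```python
from enum import Enum
from typing import Optional

# Assumed placeholder; the disclosure does not define the hailing signal.
HAIL = b"MRU-HAIL"


class Mode(Enum):
    VISUAL = "visual"
    AURAL = "aural"


def select_mode(data_channel_bytes: Optional[bytes]) -> Mode:
    """Choose the operating mode from what the modem's data output carried.

    A hailing signal on the data channel means the CPE has an accessible
    display; anything else (including no data at all) falls back to the
    conventional voice-only prompts.
    """
    if data_channel_bytes is not None and HAIL in data_channel_bytes:
        return Mode.VISUAL
    return Mode.AURAL
```

Defaulting to the aural mode whenever the hailing signal is absent is what lets a conventional telephone be served without any change on the caller's side.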

It should be observed that the MRU allows a number of different
elements to be connected in parallel to switch 13 and coupled to controller 100,
each effectively tuned to a different type of CPE. Such a capability results in a
very robust MRU that can be installed immediately in the telecommunication
network, even before any particular communication standard is developed.
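
The idea of several elements, each tuned to a different type of CPE, can be sketched as a list of detectors consulted by the controller. This is a software analogy only: the detector names and the byte signatures `SVD` and `DSP` are invented for illustration, and the sketch polls the detectors in turn rather than operating them in parallel as the hardware arrangement would.

```python
from typing import Callable, List, Optional, Tuple

# Each hypothetical detector inspects the first bytes received after the
# MRU goes off hook and reports whether they match "its" CPE type.
Detector = Callable[[bytes], bool]


def first_match(detectors: List[Tuple[str, Detector]],
                received: bytes) -> Optional[str]:
    """Return the name of the first detector that recognizes the CPE,
    or None when no detector matches (treated as a voice-only caller)."""
    for name, detect in detectors:
        if detect(received):
            return name
    return None


# Illustrative detectors only; real signatures would depend on the CPE.
DETECTORS = [
    ("svd-modem", lambda data: data.startswith(b"SVD")),
    ("other-display-phone", lambda data: data.startswith(b"DSP")),
]
```

Supporting a new kind of CPE then amounts to appending one more entry to the list, which is the robustness property noted above.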

Having established a digital communication path between the CPE and
the MRU, the task of providing a visual prompt menu to the CPE reduces to the
selection of a protocol and the design of the menus themselves. An
advantageous protocol would be one that allows response signals to emanate
either from conventional telephone 32 or from processor 33 (via keyboard/mouse
element 36). On the digital side, a protocol of the type used in today's Internet
environment is well suited. Hyper Text Markup Language (HTML) is used to
construct menus, and appropriate software running within processor 33 interprets
and displays the menus on screen 35. By clicking on particular areas on the
screen (with element 36), a signal is developed and sent to the MRU, which
controller 100 interprets and executes.
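
As a rough illustration of constructing such a menu in HTML, the following Python sketch renders a list of menu items as clickable links. The function name `menu_to_html`, the link targets, and the choice of a plain bulleted list are assumptions made for the example; the disclosure does not prescribe any particular markup.

```python
from html import escape


def menu_to_html(title, items):
    """Render (label, target) pairs as a simple HTML list of links.

    `target` is a hypothetical identifier that the MRU would receive when
    the customer clicks the item; its format is an assumption.
    """
    rows = "\n".join(
        f'  <li><a href="{escape(target, quote=True)}">{escape(label)}</a></li>'
        for label, target in items
    )
    return f"<h1>{escape(title)}</h1>\n<ul>\n{rows}\n</ul>"
```

For instance, `menu_to_html("Departments", [("Automotive", "/dept/automotive")])` yields a heading followed by a one-item list whose link carries the target that identifies the selection.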

As an aside, while the above speaks of "areas" it is understood that such
areas can be populated by words, icons, or pictures, in a manner not unlike that
currently used in Internet applications. It should also be understood that the
above described structure allows effectively skipping from one branch of the
hierarchical tree to another, because there need not be any limitations on what
displayed item is clicked on.

Information in addition to the visual choices presented can be sent by the 
MRU via the voice channel. This too can comprise information/instructions that 
are related to the displayed menus, or can be unrelated to the menus. 

As indicated above, it is advantageous to allow customers to respond via
the keyboard/mouse 36 or via the conventional telephone 32. This is
accomplished by having the displayed menus inform the customers how they can
respond (either by speaking or by pressing a directed one of the telephone push
buttons) and, at the MRU, by having processor 15 be responsive not only to the
data from the digital channel but also to information from the voice channel.
The path between voice interface 14 and processor 15 allows for such response
menus.
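
One way to picture a processor that responds to both channels is a small normalization step that maps either kind of response onto the same menu item identifier. The Python sketch below is illustrative only; the function `normalize_response`, the channel labels, and the per-screen `key_map` table are hypothetical.

```python
def normalize_response(channel, payload, key_map):
    """Map a response from either channel onto a menu item identifier.

    channel: "digital" (payload is already an item id sent by the CPE's
             software) or "voice" (payload is a DTMF digit or a
             recognized spoken word).
    key_map: per-screen table telling which digit or word selects
             which item; returns None for an unassigned digit or word.
    """
    if channel == "digital":
        return payload
    if channel == "voice":
        return key_map.get(payload.lower())
    raise ValueError(f"unknown channel: {channel!r}")
```

With a table such as `{"1": "automotive", "four": "furniture"}`, a pressed "1", a spoken "four", and a mouse click that sends `"furniture"` directly all resolve to item identifiers that the rest of the process can treat alike.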

FIG. 3 presents a general block diagram that highlights the novel
processes in the FIG. 1 MRU. It begins with block 301, which detects the presence
of an alert signal on an MRU's incoming line. When the alert signal is detected,
control passes to block 302, causing the MRU to go "off hook" (closing switch
13 in FIG. 1). At this point, modem 12 in FIG. 1 detects the hailing signal on the
incoming line, when one exists, and provides that information to controller 100.
In accordance with block 303, a determination is made whether the CPE supports
visual screens. Control passes to block 304 when block 303 determines that
the CPE does support visual screens, and to block 313 when block 303
determines that the CPE does not support visual screens. When control passes to
block 313, the conventional aural prompts process proceeds, and at its
conclusion control passes to block 310, where the business in connection with
which the call was made to the MRU is conducted.

When control passes to block 304, an initial screen is selected by
processor 15 in FIG. 1 and, pursuant to block 305, the signals corresponding to
the selected screen are sent by the MRU (via modem 12 within the MRU) to the
CPE. At this point, the MRU waits for a response from visual telephone 30. In
accordance with the principles disclosed herein, that response can be digital or
analog; that is, it can arrive at MRU 10 via the analog channel or the digital
channel. Block 306 detects the arrival of responses via the digital channel.
When no such responses are detected, block 307 detects the arrival of responses
via the analog channel. A received response is evaluated in block 308 to
determine whether it is a "final" response of the presented menu or a "non-final"
response of the menu. A final response is one that informs the MRU of the
specific request made by visual telephone 30. A non-final response is one that
requests the MRU to provide more information, via a subsequent screen. In
such a circumstance, control passes to block 309, which selects the next screen
to be provided by the MRU and passes control to block 305, which forwards that
next screen to the CPE.
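
The loop formed by blocks 304 through 309 can be sketched in Python as follows. The sketch is a simplification under stated assumptions: the screen table, the `("next", ...)` and `("final", ...)` tags, and the function name `visual_prompt_process` are hypothetical, and responses from the digital and analog channels are assumed to have already been merged into one sequence of choices.

```python
def visual_prompt_process(screens, initial, responses):
    """Run the screen loop until a final response arrives.

    screens:   {screen_id: {choice: ("next", screen_id) | ("final", request)}}
    initial:   id of the first screen (block 304)
    responses: iterable of choices received from the CPE, from either the
               digital channel (block 306) or the analog channel (block 307)
    Returns (request, sent), where `sent` lists the screens sent (block 305).
    """
    current = initial                      # block 304: select initial screen
    sent = []
    responses = iter(responses)
    while True:
        sent.append(current)               # block 305: send screen to CPE
        choice = next(responses)           # blocks 306/307: await a response
        kind, value = screens[current][choice]   # block 308: final or not?
        if kind == "final":
            return value, sent             # control proceeds to block 310
        current = value                    # block 309: select next screen
```

The returned request is what block 310 would act on; the list of screens sent is returned only to make the traversal visible in the example.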

It may be noted that block 309 functionally merely provides another
screen to the CPE. In terms of what it actually does, it should be realized that it
can perform two quite different actions. In the straightforward situation,
processor 15 in MRU 10 of FIG. 1 changes the contents of the screen, and that
change may be relatively simple. For example, the change may simply be
moving that which is described in area 202 into area 201 and displaying new
information in areas 202 and 203. A more interesting condition occurs when a
change in display requires access to another piece of apparatus, which perhaps is
another MRU. For example, assume that the screen of FIG. 2 relates to the
purchasing of airline tickets, and further assume that menu item 2021 asks the
user whether he or she wishes to rent a car. The CPE may then
select menu item 2021 or select another menu item. When such other
menu item is selected, processor 15, which is part of MRU 10 that is controlled
by an airline company, can create the next screen on its own. When menu item
2021 is selected, processor 15 does not have sufficient information. To obtain
the necessary additional information, processor 15 can access another device
(perhaps another MRU) through switch 16 and obtain the needed information to
provide the display. In the alternative, processor 15 can effectively switch the
call, or transfer control, to such other MRU, and have the CPE interact with such
other MRU directly and obtain the screens generated by that MRU.
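
The first of the two alternatives just described, in which the MRU obtains the needed screen from another device, might be sketched as below. The names `next_screen` and `remote_fetch` are hypothetical, and the callable merely stands in for whatever exchange actually takes place through switch 16; the call-transfer alternative is not modeled.

```python
def next_screen(item, local_screens, remote_fetch=None):
    """Return the screen for a selected menu item.

    local_screens: screens this MRU can build on its own.
    remote_fetch:  hypothetical callable standing in for a request made
                   through switch 16 to another device (perhaps another MRU).
    """
    if item in local_screens:
        return local_screens[item]
    if remote_fetch is None:
        raise LookupError(f"no source for screen {item!r}")
    return remote_fetch(item)
```

In the airline example, a seat-selection item would be served from `local_screens`, while the car-rental item would fall through to `remote_fetch`.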

When block 308 determines that the selection made by the CPE is a
"final" selection, control passes to block 310. FIG. 3 indicates that block 310
passes control to block 311 and that block 311 passes control to block 312. This
is a diagrammatic representation of an arrangement where, during the course of
conducting business, processor 15 continues to be sensitive to inputs from the
CPE. When an input is received that corresponds to a CPE request to return to
the menu, control is transferred from block 310 to block 311 and, therefrom, to
block 312. The request to return to the menu can occur when the CPE determines
that an erroneous menu selection was made, can occur at the end of the business
conducted (starting a new business transaction), can be initiated by the CPE, can
be initiated by MRU 10 itself, or can be initiated by equipment connected to
MRU 10 via switch 16 (e.g., a telephone activated by an operator). If the
transaction is completed and a request to return to the menu is not made, then the
process exits.

Block 312 determines the particular screen that should be provided to the
CPE and thereafter passes control to block 309. The screen selected by block
312 may be the initial screen, but it does not have to be. In fact, it can
depend on the state that block 310 was in when the escape to the menu was
detected.

It may be noted that since processor 15 continues to be sensitive to
outputs from modem 12, the conducting of business carried out in block 310
need not be simply aural in nature. That is, it can include interactions to the
screen and from the screen (assuming, of course, that the CPE supports visual
screens, and processor 15 has the information to make that determination).



Claims:

1. Telecommunication apparatus responsive to an incoming call initiated
by a party and arriving at a port of the apparatus, the apparatus providing aural
prompts to the party that identify choices and receiving responsive signals
indicative of choice selections, the improvement comprising:

a detector (11) responsive to the incoming call;

a memory (17), storing signals corresponding to visual prompts, where
each prompt identifies choices; and

a processor (15), coupled to the memory and to the port, supplying the
visual prompts to said port when the detector determines that the incoming call is
adapted to accept visual signals, and supplying only aural prompts when the
detector determines that the incoming call is not adapted to accept visual signals.

2. The apparatus of claim 1 further comprising a switch responsive to
said responsive signals for connecting the incoming call to one of a plurality of
output ports and establishing a voice communication path between the one
output port and the party.

3. The apparatus of claim 2 where the path between the one output port
and the party also includes a data communication channel.

4. The apparatus of claim 1 where the memory contains signals that form 
at least one screen image, and the image comprises signals that correspond to a 
list of menu items.



5. The apparatus of claim 4 where associated with the signals of each 
menu item there is a signal related to the selection of the menu item. 

6. The apparatus of claim 1 where the memory contains signals that form
at least one screen image, and the image comprises a multi-level hierarchical 
menu structure, with at least two levels. 

7. The apparatus of claim 6 where each level comprises menu items,
where at least one level comprises a list of menu items indicative of choice
selections.

8. The apparatus of claim 7 where at least one menu item of at least one 
level represents a request to display different levels of the menu structure.

9. The apparatus of claims 4 or 6 where the screen further comprises 
information areas. 

10. A method carried out in an automated response unit that includes an
aural prompt process of providing to a party that initiates a call to the unit an
interaction session, where the session includes a sequence of aural prompts from
the unit and corresponding responses from the party, and where each prompt
suggests choices to be made and each response indicates a choice selection, the
improvement comprising the steps of:

determining (303) whether the party is communicating with the unit via
apparatus that accepts visual signals;

initiating (313) said aural prompt process when said step of determining
concludes that the apparatus does not accept visual signals; and

executing (304-312) a visual prompt process when said step of
determining concludes that the apparatus accepts visual signals, where the visual
prompt process includes an interaction session that includes a sequence of
presenting an image to the apparatus, where the image includes choices to be
made, and of receiving from the apparatus a signal that indicates a choice
selection.

11. The method of claim 10 where the choices in the image are arranged
in a hierarchical manner. 

12. The method of claim 11 where the hierarchical order is reflected in 
levels of choices, and where some of the choices indicate a request to display 
another image and other choices indicate a request to take an action. 

13. The method of claim 12 where the action to be taken comprises a 
separate visual prompt process for obtaining information from the party. 

14. The method of claim 12 where the action to be taken comprises 
connecting the party to an operator. 

15. The method of claim 14 where at any time in the course of 
interaction with the operator, a preselected control signal received by the 
automated response unit terminates the separate visual prompt process and
re-executes the visual prompt process of claim 10.

16. The method of claim 10 where one of the choices presented by the 
image is a request to display a subsequent image and, in response to a signal 
from the apparatus indicating selection of this choice, the visual prompt process 
initiates another sequence that presents a subsequent image to the apparatus, 
where the subsequent image includes choices to be made, and receives from the 
apparatus a signal that indicates a choice selection. 

17. The method of claim 10 where the image includes choices to be 
made and information conveyed. 



18. The method of claim 17 where the information is related to the 
choices to be made.






19. The method of claim 17 where the information comprises additional 
instructions. 

20. The method of claim 17 where the information comprises advertising
images.

21. The method of claim 10 where at least one of the choices causes the 
automated response unit to connect the apparatus to a service provider unit. 

22. The method of claim 21 where the service provider unit is an
automated response unit.

23. The method of claim 22 where the automated response unit, in 
association with the connection of the apparatus to the service provider unit, 
instructs the service provider unit to execute a sequence comprising sending to
the apparatus at least one image that includes choices to be made and receiving 
from the apparatus a signal that indicates a choice selection. 